PORSCHE: Performance ORiented SCHEma mediation

Authors

  • Khalid Saleem
  • Zohra Bellahsene
  • Ela Hunt
Abstract

Semantic matching of schemas in heterogeneous data sharing systems is time-consuming and error-prone. Existing mapping tools employ semi-automatic techniques that map two schemas at a time. Such techniques are not suitable for large-scale scenarios, where data sharing involves a large number of data sources. In this paper we present a method that creates a mediated schema tree from a large set of input schema trees and defines mappings from the contributing schemas to the mediated schema. It is a two-phase approach. First, we use a set of linguistic matchers, which extract the semantics of all distinct node labels present in the input schemas and form clusters of semantically similar labels. Second, we use a tree-mining data structure, combined with the clusters of similar labels, to calculate the context of each node, which is then used in mapping. Since the input schemas are trees, our tree-mining algorithm uses node ranks calculated by pre-order traversal. Tree mining combined with semantic label clustering minimizes the target search space and improves performance, making the method suitable for large-scale data sharing. We report on experiments with up to 80 schemas containing 83,770 nodes. PORSCHE took 587 seconds to match and merge them into a mediated schema and to return mappings from the input schemas to the mediated schema. We compare the matching quality of PORSCHE with that of COMA++ on standard XML schemas and find that the mappings PORSCHE produces are very similar to those produced by COMA++.
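The two ingredients named in the abstract, label clustering and pre-order node ranks, can be illustrated with a small Python sketch. This is not the PORSCHE implementation: the `Node` class, the `scope` field, and the `normalize` function (a toy stand-in for the paper's linguistic matchers) are all assumptions made for illustration. The sketch only shows why pre-order ranks are convenient for tree schemas: once each node stores its own rank and the rank of its last descendant, an ancestor test becomes a pair of integer comparisons.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Node:
    """A schema tree node (hypothetical structure, not from the paper)."""
    label: str
    children: list[Node] = field(default_factory=list)
    rank: int = -1   # pre-order number of this node
    scope: int = -1  # pre-order number of its last descendant


def assign_preorder(root: Node) -> list[Node]:
    """Number the nodes in pre-order and return them in rank order."""
    ordered: list[Node] = []

    def visit(node: Node) -> None:
        node.rank = len(ordered)
        ordered.append(node)
        for child in node.children:
            visit(child)
        node.scope = len(ordered) - 1  # last rank handed out in this subtree

    visit(root)
    return ordered


def is_ancestor(a: Node, d: Node) -> bool:
    """True if a is a proper ancestor of d (both ranked in the same tree)."""
    return a.rank < d.rank <= a.scope


def normalize(label: str) -> str:
    """Toy stand-in for linguistic matching: lowercase, keep alphanumerics."""
    return "".join(ch for ch in label.lower() if ch.isalnum())


def cluster_labels(schemas: list[Node]) -> dict[str, set[str]]:
    """Group the distinct node labels of all schemas by normalized form."""
    clusters: dict[str, set[str]] = {}
    for root in schemas:
        for node in assign_preorder(root):
            clusters.setdefault(normalize(node.label), set()).add(node.label)
    return clusters


# Two tiny input schemas with overlapping vocabulary.
book = Node("book", [Node("Title"), Node("author", [Node("first_name")])])
library = Node("library", [Node("Book", [Node("title")])])

clusters = cluster_labels([book, library])
author, first_name = book.children[1], book.children[1].children[0]
print(sorted(clusters["book"]))         # labels sharing the cluster key "book"
print(is_ancestor(author, first_name))  # ancestor test via ranks only
```

A real matcher would replace `normalize` with tokenization, abbreviation expansion, and synonym lookup; the clustering step then restricts the candidate mappings for a node to the members of its label cluster, which is the search-space reduction the abstract refers to.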

Similar resources

PORSCHE: Performance ORiented SCHEma Matching

Semantic matching of schemas in heterogeneous data sharing systems is time consuming and error prone. Existing mapping tools employ semi-automatic techniques for mapping two schemas at a time. In a large-scale scenario, where data sharing involves a large number of data sources, such techniques are not suitable. In this paper we present a method, which creates a mediated schema tree from a larg...

Mediation Queries Adaptation After the Removal of a Data Source

A broad variety of data is available in distinct heterogeneous sources, stored under different formats: database formats (in relational and object-oriented models), document formats (SGML/XML), browser formats (HTML), message formats, etc. The integration of such data is increasingly important for modern information systems applications such as data warehousing, data mining, and web application...

Data and Process Mediation Support for B2B Integration

In this paper we present how Semantic Web Service technology can be used to overcome process and data heterogeneity in a B2B integration scenario. While one partner uses standards like RosettaNet for product purchase and UNIFI ISO 20022 for electronic payments in its message exchange process and message definition, the other one operates on non-standard proprietary solution based on a combinati...

The Effectiveness of Integrated Schema Oriented Therapy and Young’s Schema Therapy on Perception of Exclusion among individuals with Borderline Personality Characteristics

Background & Aims: Pathological personality symptoms require the attention of psychological therapists. Borderline personality characteristics, because of their significant prevalence as a personality trait, require particular attention. Accordingly, the aim of this study was to determine the effectiveness of integrated schema oriented therapy and schema therapy on perception ...

Prediction of Couple Adjustment Based on Disconnection and Rejection Schema with the Mediation of Alexithymia an in Married Elementary School Teachers of Qom City

Introduction: Today, psychologists and researchers are paying more attention to the family and its role in establishing a healthy society. One of the most important issues in couple adjustment is alexithymia. Therefore, the aim of the present study is prediction of couple adjustment based on disconnection and rejection schema with the Mediation of alexithymia an in marr...

Journal:
  • Inf. Syst.

Volume 33

Publication date: 2008